Method and apparatus for decoding a data stream in streaming
systems

This invention relates to a method and apparatus for decoding
a data stream in a buffering node for multimedia streaming
systems, like MPEG-4.

Background 

In the MPEG-4 standard ISO/IEC 14496, in particular in part 1
Systems, an audio/video (AV) scene can be composed from
several audio, video and synthetic 2D/3D objects that can be
coded with different MPEG-4 format coding types and can be
transmitted as binary compressed data in a multiplexed
bitstream comprising multiple substreams. A substream is also
referred to as Elementary Stream (ES), and can be accessed
through a descriptor. ES can contain AV data, or can be
so-called Object Description (OD) streams, which contain
configuration information necessary for decoding the AV
substreams. The process of synthesizing a single scene from
the component objects is called composition, and means mixing
multiple individual AV objects, e.g. a presentation of a video
with related audio and text, after reconstruction of packets
and separate decoding of their respective ES. The composition
of a scene is described in a dedicated ES called 'Scene
Description Stream', which contains a scene description
consisting of an encoded tree of nodes called Binary
Information For Scenes (BIFS). 'Node' means a processing step
or unit used in the MPEG-4 standard, e.g. an interface that
buffers data or carries out time synchronization between a
decoder and subsequent processing units. Nodes can have
attributes, referred to as fields, and other information
attached. A leaf node in the BIFS tree corresponds to
elementary AV data by pointing to an OD within the OD stream,
which in turn contains an ES descriptor pointing to AV data in
an ES. Intermediate nodes, or scene description nodes, group
this material to form AV objects, and perform e.g. grouping
and transformation on such AV objects. In a receiver the
configuration substreams are extracted and used to set up the
required AV decoders. The AV substreams are decoded separately
to objects, and the received composition instructions are used
to prepare a single presentation from the decoded AV objects.
This final presentation, or scene, is then played back.

According to the MPEG-4 standard, audio content can only be 
stored in the 'audioBuffer' node or in the 'mediaBuffer' node.
Both nodes are able to store a single data block at a time.
When storing another data block, the previously stored data 
block is overwritten. 

The 'audioBuffer' node can only be loaded with data from the
audio substream when the node is created, or when the 'length'
field is changed. This means that the audio buffer can only be
loaded with one continuous block of audio data. The allocated
memory matches the specified amount of data. Further, it may
happen that the timing of loading data samples is not exact,
due to the timing model of the BIFS decoder.

For loading more than one audio sample, it is possible to
build up an MPEG-4 scene using multiple 'audioBuffer' nodes.
But it is difficult to handle the complexity of the scene, and
to synchronize the data stored in the different 'audioBuffer'
nodes. Additionally, for each piece of information a new
stream has to be opened.

Summary of the Invention

The problem to be solved by the invention is to improve 
storage and retrieval of single or multiple data blocks in 
multimedia buffer nodes in streaming systems, like MPEG-4.

This problem is solved by the present invention as disclosed 
in claim 1. An apparatus using the inventive method is 
disclosed in claim 8. 

According to the invention, additional parameters are added to
the definition of a multimedia buffer node, e.g. audio or
video node, so that multiple data blocks with AV contents can
be stored and selectively processed, e.g. included into a
scene, updated or deleted. In the case of MPEG-4 these
additional parameters are new fields in the description of a
node, e.g. in the 'audioBuffer' node or 'mediaBuffer' node.
The new fields define the position of a data block within a
received data stream, e.g. audio stream, and how to handle the
loading of this block, e.g. overwriting previously stored data
blocks or accumulating data blocks in a buffer.

Brief description of the drawings 

Exemplary embodiments of the invention are described with
reference to the accompanying drawings, which show in

Fig. 1 the general structure of an MPEG-4 scene;

Fig. 2 an exemplary 'AdvancedAudioBuffer' node for MPEG-4; and

Fig. 3 the fields within an exemplary 'AdvancedAudioBuffer'
node for MPEG-4.

Detailed description of the invention



Fig. 1 shows the composition of an MPEG-4 scene, using a scene
description received in a scene description stream ES_ID_S. The
scene comprises audio, video and other data, and the audio and
video composition is defined in an AV node ODID_AV. The audio
part of the scene is composed in an audio compositor, which
includes an AdvancedAudioBuffer node and contains a reference
ODID_A to an audio object, e.g. decoder. The actual audio data
belonging to this audio object are contained as packets in an
ES, namely the audio stream, which is accessible through its
descriptor ES_ID_A. The AdvancedAudioBuffer node may pick out
multiple audio data packets from the audio stream ES_ID_A coming
from an audio decoder.

The audio part of an MPEG-4 scene is shown in more detail in
Fig. 2. The audio part of a scene description 10 contains a
sound node 11 that has an AdvancedAudioBuffer node 12,
providing an interface for storing audio data. The audio data
to be stored consist of packets within the audio stream 14,
which is received from an audio decoder. For each data packet
it is specified at which time it is to be decoded. The
AdvancedAudioBuffer node 12 holds the time information for the
packets to load, e.g. start time t_1 and end time t_2. Further,
it can identify and access the required ES by referring to an
AudioSource node 13. The AdvancedAudioBuffer node may buffer
the specified data packet without overwriting previously
received data packets, as long as it has sufficient buffer
capacity.

The AdvancedAudioBuffer node 12 can be used instead of the
AudioBuffer node defined in subclause 9.4.2.7 of the MPEG-4
systems standard ISO/IEC 14496-1:2002. As compared to the
AudioBuffer node, the inventive AdvancedAudioBuffer node has
an enhanced load mechanism that allows e.g. reloading of data.

The AdvancedAudioBuffer node can be defined using the MPEG-4
syntax, as shown in Fig. 3. It contains a number of fields and
events. Fields have the function of parameters or variables,
while events represent a control interface to the node. The
function of the following fields is described in ISO/IEC
14496-1:2002, subclause 9.4.2.7: 'loop', 'pitch', 'startTime',
'stopTime', 'children', 'numChan', 'phaseGroup', 'length',
'duration_changed' and 'isActive'. The 'length' field
specifies the length of the allocated audio buffer in seconds.
In the current version of the mentioned standard this field
cannot be modified. This means that another AudioBuffer node
must be instantiated when another audio data block shall be
loaded, since audio data is buffered at the instantiation of
the node. But the creation of a new node is a rather complex
software process, and may result in a delay leading to
differing time references in the created node and the BIFS
tree.

The following new fields, compared to the AudioBuffer node,
are included in the AdvancedAudioBuffer node: 'startLoadTime',
'stopLoadTime', 'loadMode', 'numAccumulatedBlocks',
'deleteBlock' and 'playBlock'. With these new fields it is
possible to enable new functions, e.g. load and delete stored
data. Further, it is possible to define at node instantiation
time the buffer size to be allocated, independently from the
actual amount of data to be buffered. The buffer size to be
allocated is specified by the 'length' field. The 'startTime'
and 'stopTime' fields can be used alternatively to the
'startLoadTime' and 'stopLoadTime' fields, depending on the
mode described in the following.

Different load mechanisms may exist, which are specified by
the field 'loadMode'. The different load modes are e.g.
Compatibility mode, Reload mode, Accumulate mode, Continuous
Accumulate mode and Limited Accumulate mode.
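
As a compact summary, the new fields and the defaults and
ranges given for them in Fig. 3 can be written down as a small
Python data structure. This is only an illustrative sketch: the
class name `NewBufferFields`, the snake_case attribute names and
the use of `ValueError` for range violations are assumptions
made for this example and are not part of the MPEG-4 syntax.

```python
from dataclasses import dataclass


@dataclass
class NewBufferFields:
    """New AdvancedAudioBuffer fields; defaults and ranges as in Fig. 3.

    Hypothetical container for illustration, not an MPEG-4 data type.
    """
    start_load_time: float = 0.0     # 'startLoadTime', SFTime
    stop_load_time: float = 0.0      # 'stopLoadTime', SFTime
    load_mode: int = 0               # 'loadMode', SFInt32, range >= 0
    num_accumulated_blocks: int = 0  # 'numAccumulatedBlocks', range >= 0
    delete_block: int = 0            # 'deleteBlock', SFInt32, range <= 0
    play_block: int = 0              # 'playBlock', SFInt32

    def __post_init__(self) -> None:
        # Enforce the ranges listed in Fig. 3.
        if self.load_mode < 0:
            raise ValueError("loadMode must be >= 0")
        if self.num_accumulated_blocks < 0:
            raise ValueError("numAccumulatedBlocks must be >= 0")
        if self.delete_block > 0:
            raise ValueError("deleteBlock must be <= 0")
```

Constructing `NewBufferFields()` without arguments yields the
defaults, while a positive `delete_block` is rejected because
Fig. 3 restricts that field to values of 0 or below.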

In Compatibility mode, audio data shall be buffered at the
instantiation of the AdvancedAudioBuffer node, and whenever
the 'length' field changes. The 'startLoadTime', 'stopLoadTime',
'numAccumulatedBlocks', 'deleteBlock' and 'playBlock' fields
have no effect in this mode. The 'startTime' and 'stopTime'
fields specify the data block to be buffered.

In Reload mode, the 'startLoadTime' and 'stopLoadTime' fields
are valid. When the time reference of the AdvancedAudioBuffer
node reaches the time specified in the 'startLoadTime' field,
the internal data buffer is cleared and the samples at the
input of the node are stored until the value in the
'stopLoadTime' field is reached, or the stored data have the
length defined in the 'length' field. If the 'startLoadTime'
value is higher than or equal to the 'stopLoadTime' value, a
data block with the length defined in the 'length' field will
be loaded at the time specified in 'startLoadTime'. The
'numAccumulatedBlocks', 'deleteBlock' and 'playBlock' fields
have no effect in this mode.
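
The Reload rule can be sketched in a few lines of Python. This
is a minimal illustration under stated assumptions, not the
node's actual implementation: the function names
`reload_window` and `reload`, the modeling of the input as
(timestamp, payload) pairs, the use of seconds for all values
and the half-open loading interval are all hypothetical choices
made for the example.

```python
from typing import Iterable, List, Tuple


def reload_window(start_load_time: float, stop_load_time: float,
                  length: float) -> Tuple[float, float]:
    """Return the (begin, end) interval in seconds stored in Reload mode."""
    if start_load_time >= stop_load_time:
        # No usable stop time: load one block of 'length' seconds.
        return start_load_time, start_load_time + length
    # Stop at 'stopLoadTime' or when 'length' seconds are stored,
    # whichever comes first.
    return start_load_time, min(stop_load_time, start_load_time + length)


def reload(samples: Iterable[Tuple[float, str]], start_load_time: float,
           stop_load_time: float, length: float) -> List[str]:
    """Return the buffer contents after one Reload pass over 'samples'."""
    begin, end = reload_window(start_load_time, stop_load_time, length)
    buffer: List[str] = []  # the buffer is cleared when loading starts
    for timestamp, payload in samples:
        if begin <= timestamp < end:  # assumption: half-open interval
            buffer.append(payload)
    return buffer
```

With 'startLoadTime' 2.0, 'stopLoadTime' 5.0 and a 10-second
buffer, the window runs from 2.0 to 5.0 seconds; reducing
'length' to 1.5 seconds cuts it off at 3.5 seconds.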

"In the Accumulate mode a data block defined by the interval 
between the 'startLoadTime' and 'stopLoadTime' field values is
appended at the end of the buffer contents. In order to have
all data blocks accessible, the blocks are indexed, or
labeled, as described below. When the limit defined by the
'length' field is reached, loading is finished. The field
'numAccumulatedBlocks' has no effect in this mode.

In the Continuous Accumulate mode a data block defined by the
interval between the 'startLoadTime' and 'stopLoadTime' field
values is appended at the end of the buffer contents. All data
blocks in the buffer are indexed to be addressable, as
described before. When the limit defined by the 'length' field
is reached, the oldest stored data may be discarded, or
overwritten. The field 'numAccumulatedBlocks' has no effect in
this mode.

The Limited Accumulate mode is similar to the Accumulate
mode, except that the number of stored blocks is limited to
the number specified in the 'numAccumulatedBlocks' field. In
this mode, the 'length' field has no effect.
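
The difference between the three accumulating modes lies only
in what happens when a limit is hit, which the following Python
sketch tries to make explicit. It rests on several assumptions
that are not stated in the text: the names `LoadMode`, `Block`
and `append_block` are hypothetical, no numeric 'loadMode'
codes are implied, a block is either stored whole or refused,
and a block longer than the whole buffer is refused in
Continuous Accumulate mode.

```python
from enum import Enum
from typing import List, Tuple

Block = Tuple[str, float]  # (label, duration in seconds)


class LoadMode(Enum):
    # Hypothetical names; the text does not give numeric 'loadMode' codes.
    ACCUMULATE = "accumulate"
    CONTINUOUS_ACCUMULATE = "continuous accumulate"
    LIMITED_ACCUMULATE = "limited accumulate"


def append_block(blocks: List[Block], new: Block, mode: LoadMode,
                 length: float, num_accumulated_blocks: int) -> bool:
    """Append 'new' to 'blocks' in place; return True if it was stored."""
    used = sum(duration for _, duration in blocks)
    if mode is LoadMode.ACCUMULATE:
        if used + new[1] > length:
            return False  # limit reached: loading is finished
    elif mode is LoadMode.CONTINUOUS_ACCUMULATE:
        if new[1] > length:
            return False  # assumption: an oversized block is refused
        while blocks and used + new[1] > length:
            used -= blocks.pop(0)[1]  # discard the oldest stored block
    elif mode is LoadMode.LIMITED_ACCUMULATE:
        if len(blocks) >= num_accumulated_blocks:
            return False  # block count limit reached; 'length' is ignored
    blocks.append(new)
    return True
```

Feeding three 2-second blocks into a 5-second buffer keeps the
first two in Accumulate mode and the last two in Continuous
Accumulate mode; in Limited Accumulate mode only the block count
decides.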

For some of the described load mechanisms, a transition from 0
to a value below 0 in the 'deleteBlock' field starts the
deletion of a data block, addressed relative to the latest data
block. The latest block is addressed with -1, the block before
it with -2 etc. This is possible e.g. in the following load
modes: Accumulate mode, Continuous Accumulate mode and Limited
Accumulate mode.

Since the inventive buffer may hold several data blocks, it is
advantageous to have a possibility to select a particular data
block for reproduction. The 'playBlock' field defines the
block to be played. If the 'playBlock' field is set to 0, as
is done by default, the whole content will be played, using
the 'startTime' and 'stopTime' conditions. This is the
above-mentioned Compatibility mode, since it is compatible with
the function of the known MPEG-4 system. A negative value of
'playBlock' addresses a block relative to the latest block,
e.g. the latest block is addressed with -1, the previous block
with -2 etc.
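
The relative addressing shared by 'deleteBlock' and 'playBlock'
can be sketched as follows. This is a hedged illustration only:
the function names `resolve`, `on_delete_block` and
`select_for_playback` are hypothetical, addresses that point
outside the buffer are assumed to be ignored, positive
'playBlock' values (which the text does not define) are assumed
to select nothing, and the 'startTime' and 'stopTime'
conditions are left out for brevity.

```python
from typing import List, Optional


def resolve(blocks: List[str], relative: int) -> Optional[int]:
    """Map a negative relative address (-1 = latest) to a list position."""
    if relative < 0 and -relative <= len(blocks):
        return len(blocks) + relative
    return None  # assumption: addresses outside the buffer are ignored


def on_delete_block(blocks: List[str], previous: int, current: int) -> bool:
    """Delete one block when 'deleteBlock' changes from 0 to below 0."""
    position = resolve(blocks, current) if previous == 0 else None
    if position is None:
        return False
    del blocks[position]
    return True


def select_for_playback(blocks: List[str], play_block: int) -> List[str]:
    """Return the blocks addressed by 'playBlock'."""
    if play_block == 0:
        return list(blocks)  # default: the whole content
    position = resolve(blocks, play_block)
    return [] if position is None else [blocks[position]]
```

For a buffer holding the blocks m1, m2 and m3 in order of
arrival, a 'playBlock' value of -1 selects m3, and a
'deleteBlock' transition from 0 to -2 removes m2.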

It is an advantage of the inventive method that a buffer node
can be reused, since loading data to the node is faster than
in the current MPEG-4 standard, where a new node has to be
created before data can be buffered. Therefore it is easier
for the AdvancedAudioBuffer node to match the timing reference
of the BIFS node, and thus synchronize e.g. audio and video
data in MPEG-4.

An exemplary application for the invention is a receiver that
receives a broadcast program stream containing various
different elements, e.g. traffic information. From the audio
stream, the packets with traffic information are extracted.
With the inventive MPEG-4 system it is possible to store these
packets, which are received discontinuously at different
times, in the receiver in a way that they can be accumulated
in its buffer, and then presented at a user defined time. For
example, the user may have an interface to call the latest
traffic information message at any time, or filter or delete
traffic information messages manually or automatically. On the
other hand, the broadcaster can also selectively delete or
update traffic information messages that are already stored in
the receiver's data buffer.

Advantageously, the invention can be used for all kinds of
devices that receive data streams composed of one or more
control streams and one or more multimedia data streams, and
wherein a certain type of information is divided into
different blocks sent at different times. Particularly these
are broadcast receivers and all types of music rendering
devices.

The invention is particularly good for receivers for MPEG-4
streaming systems.

Claims



1. Method for decoding a data stream, the data stream
containing a first and a second substream, the first
substream (14) containing multimedia data packets and the
second substream containing control information (10),
wherein the multimedia data packets contain an indication
of the time when to be presented, and wherein the
multimedia data packets are decoded prior to the
indicated presentation time, characterized in that

- first decoded multimedia data packets are buffered at
least until, after a further processing, they can be
presented in due time; and

- other multimedia data packets may be buffered, wherein
the other multimedia data packets may replace or be
appended to the first decoded multimedia data packets.

2. Method according to claim 1, wherein the control
information defines whether the other multimedia data
packets are appended to the first data packets or replace
them.

3. Method according to claim 1 or 2, wherein the control
information contains first, second and third control
data,

- the first control data (Length) defining the allocated
buffer size,

- the second control data (LoadMode) defining whether the
other multimedia data packets are appended to the first
packets or replace them, and

- the third control data (StartLoadTime, StopLoadTime)
defining a multimedia data packet to be buffered.

4. Method according to any of claims 1-3, wherein labels are
attached to the buffered first and other multimedia data
packets, and the packets may be accessed through their
respective label.

5. Method according to any of claims 1-4, wherein the labels 
attached to the buffered data packets contain an index 
relative to the latest received data packet. 

6. Method according to any of claims 1-5, wherein the first
substream contains audio data and the second substream 
contains a description of the presentation. 

7. Method according to any of claims 1-6, wherein the data 
stream is compliant with the MPEG-4 standard.

8. Apparatus comprising means for decoding a data stream,
the data stream containing a first and a second
substream, the first substream (14) containing multimedia
data packets and the second substream containing control
information (10), wherein the multimedia data packets
contain an indication of the time when to be presented,
characterized in that

- first decoded multimedia data packets are buffered in a
buffering means; and

- other multimedia data packets are buffered in the same
buffering means, wherein the other multimedia data
packets may replace or be appended to the first decoded
multimedia data packets, depending on the control
information (10).

9. Apparatus according to claim 8, further comprising means
for attaching labels to the buffered first and other
multimedia data packets, and means for accessing,
retrieving or deleting the packets through their
respective label.

10. Apparatus according to claim 8 or 9, wherein the data
stream is an MPEG-4 compliant data stream.

Abstract

A method for decoding a data stream containing audio/video
substreams (14) and control substreams comprises buffering
nodes (12) having the possibility to buffer multiple data
packets in the same buffer. This may be achieved by having
separate parameters for the allocated buffer size and any
stored packet. Thus, not only multiple packets may be stored
in the buffering node (12), but also such node may exist while
its buffer is empty, so that the node may be reused later.
This is particularly useful for buffering and selectively
accessing multiple audio packets in MPEG-4 audio nodes or
sound nodes.

Fig. 2

AdvancedAudioBuffer {

  Field type    Data type  Name                  Default  Range
  ExposedField  SFBool     Loop                  FALSE
  ExposedField  SFFloat    Pitch                 1.0
  ExposedField  SFTime     StartTime             0
  ExposedField  SFTime     StopTime              0
  ExposedField  SFTime     StartLoadTime         0
  ExposedField  SFTime     StopLoadTime          0
  ExposedField  SFInt32    LoadMode              0        >=0
  ExposedField  SFInt32    NumAccumulatedBlocks  0        >=0
  ExposedField  SFInt32    DeleteBlock           0        <=0
  ExposedField  SFInt32    PlayBlock             0
  ExposedField  MFNode     Children              []
  ExposedField  SFInt      NumChan               1
  ExposedField  MFInt      PhaseGroup            [1]
  ExposedField  SFFloat    Length                0.0
  EventOut      SFTime     Duration_changed
  EventOut      SFBool     isActive

}